Abstract
measure, and recurrence quotient for Sturmian words with slope $\alpha$, have automatic
continued fraction expansions if $\alpha$ does.

1 Introduction



A sequence $(a_n)_{n \ge 0}$ over a finite alphabet $\Delta$ is said to be $k$-automatic for
some integer $k \ge 2$ if, roughly speaking, there exists an automaton that,
on input $n$ in base $k$, reaches a state with the output $a_n$. More formally, a
sequence $(a_n)_{n \ge 0}$ over $\Delta$ is $k$-automatic if there exists a deterministic finite
automaton with output (DFAO) $M = (Q, \Sigma_k, \Delta, \delta, q_0, \tau)$ where $Q$ is a finite
set of states, $\Sigma_k = \{0, 1, 2, \ldots, k-1\}$, $\delta : Q \times \Sigma_k \to Q$ is the transition
function, and $\tau : Q \to \Delta$ is the output function, such that if $w$ is any base-$k$
representation of $n$, possibly with leading zeroes, then $a_n = \tau(\delta(q_0, w^R))$.
(Note that $a_0 = \tau(q_0)$.) Here $w^R$ is the reverse of the word $w$.

This class of sequences, also called $k$-recognizable in the literature, has been
studied extensively (e.g., [9]) and has several different characterizations, the
most famous being images (under a coding) of fixed points of $k$-uniform morphisms.

The archetypal example of a $k$-automatic sequence is the Thue-Morse sequence

$\mathbf{t} = (t_n)_{n \ge 0} = 0110100110010110 \cdots$

where $t_n$ is the sum (modulo 2) of the bits in the base-2 expansion of $n$ [8].
See Figure 1. It can also be viewed as the fixed point of the morphism $\mu$ where
$0 \to 01$ and $1 \to 10$.

Fig. 1. Automaton generating the Thue-Morse sequence
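The three descriptions of the Thue-Morse sequence given above (automaton, bit sum, fixed point of the morphism) can be compared on a finite prefix. The following Python sketch is only an illustration: the function names and the dictionary encoding of the DFAO are assumptions made for this example, and the automaton reads the base-2 digits of the index least significant digit first.

```python
def dfao_output(n, k, delta, q0, tau):
    """Run a DFAO on the base-k digits of n, least significant digit first."""
    q = q0
    while n > 0:
        n, d = divmod(n, k)
        q = delta[(q, d)]
    return tau[q]

# Two-state automaton: the state is the parity of the 1 bits read so far.
TM_DELTA = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
TM_TAU = {0: 0, 1: 1}

def thue_morse_by_automaton(length):
    return [dfao_output(n, 2, TM_DELTA, 0, TM_TAU) for n in range(length)]

def thue_morse_by_bit_sum(length):
    return [bin(n).count("1") % 2 for n in range(length)]

def thue_morse_by_morphism(length):
    """Iterate 0 -> 01, 1 -> 10 starting from 0 and keep a prefix."""
    word = [0]
    while len(word) < length:
        word = [b for a in word for b in ((0, 1) if a == 0 else (1, 0))]
    return word[:length]
```

Because reading a 0 never changes the state of this automaton, padding the input with extra zeros does not affect the output, as the definition of a DFAO requires.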

Given a $k$-automatic sequence, one might reasonably inquire as to whether
the sequence is ultimately periodic. More precisely, we would like to know if
the problem

Given a $k$-automatic sequence, is it ultimately periodic?

is decidable (i.e., recursively solvable). This problem was solved by Honkala
[20], who gave a rather complicated decision procedure.

In this paper, we begin by recalling a technique of Lehr [28] as simplified by
Allouche and Shallit [9, pp. 380-382]. In Section 2 we introduce it and use it
to reprove the result of Honkala mentioned above.
Another topic of great interest is the pattern-avoiding properties of certain
automatic sequences. For example, more than a hundred years ago Thue proved
[37,38] that $\mathbf{t}$ contains no overlaps, where an overlap is a word of the form
$axaxa$, where $a$ is a single letter and $x$ is a word, possibly empty. Examples
of overlaps include alfalfa in English, entente in French, and ajaja and
tutut in Finnish.

Similarly, much attention has been given to avoiding squares. A square is
a word of the form $xx$ where $x$ is nonempty. Examples of squares include
murmur in English, chercher in French, and valtavalta in Finnish. A (finite
or infinite) word is squarefree if it contains no square factor. As is well-known,
if one counts the lengths of the blocks of 1's between consecutive 0's in $\mathbf{t}$, one
obtains the squarefree sequence

$\mathbf{v} = (v_n)_{n \ge 0} = 210201210120 \cdots$

The word $\mathbf{v}$ is generated as the fixed point of the morphism $g$ defined by
$2 \to 210$, $1 \to 20$, and $0 \to 1$. Furthermore, $\mathbf{v}$ is generated by the automaton
depicted in Figure 2. Here the input is $n$ expressed in base 2, starting with
the least significant digit, and the output, given by the symbol labeling the
state, is $v_n$. (Contrast this with the representation given by Berstel [10].)








Fig. 2. Automaton generating a squarefree sequence
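The two descriptions of the squarefree sequence (run lengths of 1's in the Thue-Morse sequence, and fixed point of the morphism $2 \to 210$, $1 \to 20$, $0 \to 1$) can be checked against each other on a prefix. The sketch below is a finite sanity check only, with helper names chosen for this illustration; the square test is brute force and proves nothing about the infinite word.

```python
def thue_morse(length):
    return [bin(n).count("1") % 2 for n in range(length)]

def block_lengths(t):
    """Lengths of the runs of 1s between consecutive 0s of t."""
    zeros = [i for i, x in enumerate(t) if x == 0]
    return [b - a - 1 for a, b in zip(zeros, zeros[1:])]

def iterate_g(length):
    """Prefix of the fixed point of g: 2 -> 210, 1 -> 20, 0 -> 1."""
    g = {2: [2, 1, 0], 1: [2, 0], 0: [1]}
    word = [2]
    while len(word) < length:
        word = [y for x in word for y in g[x]]
    return word[:length]

def has_square(w):
    """Brute-force search for a factor xx with x nonempty."""
    n = len(w)
    return any(w[i:i + m] == w[i + m:i + 2 * m]
               for m in range(1, n // 2 + 1)
               for i in range(n - 2 * m + 1))
```

The same `has_square` helper also confirms that the image of the squarefree word 212 under this morphism, namely 21020210, contains a square.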

We can generalize the concept of power to non-integer powers. Let $\alpha$ be a real
number $> 1$. We say that a word $z$ is an $\alpha$-power if it is the shortest prefix of
length $\ge \alpha|x|$ of some infinite word $x^\omega = xxx\cdots$, and we say it is an $\alpha^+$-power
if it is the shortest prefix of length $> \alpha|x|$ of $x^\omega$. For example, the English word
$z$ = abracadabra is both a $3/2$ and a $(3/2)^+$ power, as $z$ is a prefix of length
11 of $(\mathrm{abracad})^\omega$, and $10/7 < 3/2 < 11/7$. Using this notation, an overlap is
a $2^+$ power. We say a (finite or infinite) word $z$ contains an $\alpha$-power if we can
write $z = uvw$ where $v$ is an $\alpha$-power. We say that a (finite or infinite) word $z$
avoids $\alpha$-powers or is $\alpha$-power-free if it has no factor that is an $\alpha$-power, and
similarly for $\alpha^+$-powers.
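The definition of $\alpha$-powers and $\alpha^+$-powers can be made concrete for rational $\alpha$. The sketch below uses exact rational arithmetic; the function names are assumptions for this illustration, and the period word $x$ is supplied by the caller rather than searched for.

```python
from fractions import Fraction
from math import ceil, floor

def power_prefix_length(period, alpha, plus=False):
    """Length of the shortest prefix of x^omega whose length is
    >= alpha*|x| (or > alpha*|x| for an alpha^+-power)."""
    bound = Fraction(alpha) * period
    return floor(bound) + 1 if plus else ceil(bound)

def is_alpha_power(z, x, alpha, plus=False):
    """True if z is the alpha-power (or alpha^+-power) with period word x."""
    if not x:
        return False
    n = power_prefix_length(len(x), alpha, plus)
    repeated = (x * (n // len(x) + 1))[:n]
    return z == repeated
```

For abracadabra with $x$ = abracad the bound is $3/2 \cdot 7 = 21/2$, so both variants give length 11, which is why the word is at once a $3/2$-power and a $(3/2)^+$-power.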
In Section 3 we use Lehr's technique to prove a new result: that it is decidable
whether a given $k$-automatic sequence is squarefree, overlap-free, contains an
$r$-power for $r$ rational, contains an $r^+$-power, etc.

Let $\mathbf{a} = (a_n)_{n \ge 0}$ be a sequence over a finite alphabet $\Delta$. The orbit of $\mathbf{a}$, written
$\mathrm{Orb}(\mathbf{a})$, is the set of all its shifts, that is, the set of sequences
$\{(a_{n+i})_{n \ge 0} : i \ge 0\}$. The orbit closure of $\mathbf{a}$, written $\mathrm{Cl}(\mathrm{Orb}(\mathbf{a}))$, is the closure
of $\mathrm{Orb}(\mathbf{a})$ under the usual topology where two sequences are close if they agree on a
long prefix. More transparently, a sequence $\mathbf{b} = (b_n)_{n \ge 0}$ is in the orbit closure
of $\mathbf{a}$ if and only if every finite prefix of $\mathbf{b}$ is a factor of $\mathbf{a}$ [9, Prop. 10.8.9, p. 327].

An infinite word $\mathbf{a}$ is said to be recurrent if every finite factor that occurs
in $\mathbf{a}$ occurs infinitely often. It is not hard to see that if $\mathbf{a}$ is recurrent and
not periodic, then $\mathrm{Cl}(\mathrm{Orb}(\mathbf{a}))$ is uncountable [9, Thm. 10.8.12, p. 328]. If $\mathbf{a}$
is not recurrent this may not be true; for example, consider the infinite word
$\mathbf{c} = abaabaaabaaaab\cdots$. Then $\mathrm{Cl}(\mathrm{Orb}(\mathbf{c}))$ is countable because once a finite
factor contains two or more $b$'s, its position in $\mathbf{c}$ is fixed and hence can be
extended in at most one way. Thus $\mathrm{Cl}(\mathrm{Orb}(\mathbf{c}))$ equals $a^\omega \cup a^* b a^\omega \cup \mathrm{Orb}(\mathbf{c})$,
and hence is countable.

In Section 4 we are interested in elements in the orbit closure of automatic
sequences. From the result mentioned above, if $\mathbf{a}$ is recurrent and not periodic, then "most"
of the sequences in $\mathrm{Cl}(\mathrm{Orb}(\mathbf{a}))$ cannot be $k$-automatic for any $k$, since the
orbit closure is uncountable while the set of $k$-automatic sequences over $\Delta$ is
countable. Evidently, this is true even if $\mathbf{a}$ itself is not automatic.

Now suppose that $\mathbf{a}$ is $k$-automatic, and consider the lexicographically least
sequence $\mathbf{b}$ in $\mathrm{Cl}(\mathrm{Orb}(\mathbf{a}))$. We show in Section 4 that $\mathbf{b}$ is also $k$-automatic,
and more generally, any sequence chosen in a periodic way from the factor
tree of $\mathbf{a}$ is also $k$-automatic.



2 Periodicity 

Let $\mathbf{a} = (a_n)_{n \ge 0}$ be an infinite sequence. Then $\mathbf{a}$ is ultimately periodic if there
exist integers $P \ge 1$, $N \ge 0$ such that $a_i = a_{i+P}$ for all $i \ge N$.

Theorem 1 Given a DFAO $M = (Q, \Sigma_k, \Delta, \delta, q_0, \tau)$ it is decidable if the
$k$-automatic sequence it generates is ultimately periodic.

As mentioned before, this result is due to Honkala [20]. We give a new proof.
Proof. We start with a sketch of the proof. First, we construct an NFA $M_1$
that on input $(P, N)$ "guesses" $I$ and accepts if $I \ge N$ and $a_I \ne a_{I+P}$. We
now convert $M_1$ to a DFA $M_2$ using the usual subset construction, and then
interchange accepting and non-accepting states, obtaining a DFA $M_3$ with the
property that $M_3$ accepts $(P, N)$ if and only if $a_I = a_{I+P}$ for all $I \ge N$. Now
$\mathbf{a}$ is ultimately periodic if and only if $M_3$ accepts some input with $P \ge 1$, which can be
checked using the usual depth-first search technique to determine if there is a
path from the initial state to a final state.

We now give the proof in detail, addressing concerns such as exactly how $P$
and $N$ are represented, what it means to guess $I$, how we verify that $I \ge N$,
how we compute $I + P$, and what if $I$ is significantly larger than $P$ or $N$.

When we say that $M_1$ takes $(P, N)$ as input, what we really mean is that the
input alphabet of $M_1$ is $\Sigma_k \times \Sigma_k$, so that $M_1$ takes as input the base-$k$ digits
of $P$ and $N$ in parallel. More precisely, the input is $(p_0, n_0)(p_1, n_1) \cdots (p_j, n_j)$
where $n_j n_{j-1} \cdots n_0$ is a base-$k$ representation of $N$ and $p_j p_{j-1} \cdots p_0$ is a base-$k$
representation of $P$, either or both padded with leading zeros to ensure that
their lengths are the same. This means that $(P, N)$ can be input in infinitely
many ways, depending on the number of leading zeros (which are actually
trailing zeros since we read the input starting with the least significant digit),
and we must ensure that the correct result is returned in each case.

When we say we guess $I$, what we really mean is that we successively guess
the base-$k$ digits of $I$, starting with the least significant digit.

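In order to verify that our guessed $I$ is $\ge N$, we maintain a flag that records
how the number represented by the digits of $I$ seen so far stands in relation
to the digits of $N$ seen so far: whether it is $<$, $=$, or $>$. The flag is updated
as follows, if the next digit of $I$ guessed is $i'$ and the next digit of $N$ is $n'$:

\[
u(<, i', n') = \begin{cases} <, & \text{if } i' \le n'; \\ >, & \text{if } i' > n'; \end{cases}
\qquad
u(=, i', n') = \begin{cases} <, & \text{if } i' < n'; \\ =, & \text{if } i' = n'; \\ >, & \text{if } i' > n'; \end{cases}
\qquad
u(>, i', n') = \begin{cases} <, & \text{if } i' < n'; \\ >, & \text{if } i' \ge n'. \end{cases}
\tag{1}
\]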
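Since digits arrive least significant first, the most recently read digit always dominates the comparison, which is what the three cases of the update map express. A minimal Python sketch of the flag follows; the function names are assumptions for this illustration.

```python
def update_flag(flag, i_digit, n_digit):
    """Update map u: a new, more significant digit overrides the old flag
    unless the two digits are equal."""
    if i_digit < n_digit:
        return "<"
    if i_digit > n_digit:
        return ">"
    return flag

def compare_lsd_first(i, n, k):
    """Compare i and n by scanning their base-k digits, least significant first."""
    flag = "="
    while i > 0 or n > 0:
        i, i_digit = divmod(i, k)
        n, n_digit = divmod(n, k)
        flag = update_flag(flag, i_digit, n_digit)
    return flag
```

A single three-valued flag therefore suffices, so the comparison adds only a constant factor to the number of states.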



To compute $I+P$, we maintain a "carry" bit, and compute $I+P$ digit-by-digit
as we see the digits of $P$ input, using the usual pencil-and-paper method.
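The digit-by-digit addition can be sketched as follows. This is an illustration with assumed helper names; it processes two equal-length digit lists, least significant digit first, and returns the final carry separately so that the role of padding with zeros is visible.

```python
def to_digits(n, k, width):
    """Base-k digits of n, least significant first, padded to the given width."""
    digits = []
    for _ in range(width):
        n, d = divmod(n, k)
        digits.append(d)
    return digits

def from_digits(digits, k):
    return sum(d * k ** e for e, d in enumerate(digits))

def add_lsd_first(i_digits, p_digits, k):
    """Digits of I + P computed with a single carry; returns (digits, final carry)."""
    carry = 0
    out = []
    for i_d, p_d in zip(i_digits, p_digits):
        s = i_d + p_d + carry
        out.append(s % k)
        carry = s // k
    return out, carry
```

When two numbers are added the carry is always 0 or 1, whatever the base, so one bit of state is enough.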
Finally, since we guess the digits of $I$ in parallel with the digits of the inputs
$P$ and $N$, we have to address the situation where the base-$k$ representation
of the appropriate $I$ to guess is longer than the representation of the inputs
$P$ and $N$. If we do not pad $P$ and $N$ with enough 0's, we might return the
wrong result. To handle this, we modify the acceptance criterion of the NFA
$M_1$, making a state accepting if an accepting state could be reached by any
input of the form $(0, 0)^j$, $j \ge 0$.

We now give the construction in more detail. Suppose $M = (Q, \Sigma_k, \Delta, \delta, q_0, \tau)$
is a $k$-DFAO. We make an NFA $M_1 = (Q', \Sigma_k \times \Sigma_k, \delta', q'_0, F')$ as follows.

$Q' = \{<, =, >\} \times \{0, 1\} \times Q \times Q$;

$q'_0 = [=, 0, q_0, q_0]$;

$F' = \{[b, 0, q, r] : b \in \{>, =\} \text{ and } \tau(q) \ne \tau(r)\}$.

The meaning of a state $[b, c, q, r]$ of $Q'$ is that $b$ is the flag maintaining the
relationship between $I$ and $N$; $c$ is the carry bit in the computation of $I + P$;
$q$ is the state in $M$ reached by the digits of $I$ seen so far; and $r$ is the state in
$M$ reached by the digits of $I + P$ calculated so far.

We define $\delta'$ by

\[
\delta'([b, c, q, r], (p', n')) := \left\{ \left[ u(b, i', n'),\ \left\lfloor \frac{i' + p' + c}{k} \right\rfloor,\ \delta(q, i'),\ \delta(r, (i' + p' + c) \bmod k) \right] : 0 \le i' < k \right\}.
\]

Here $u$ is the update map defined in Eq. (1).

This finishes the construction of the NFA $M_1$. We now create a new NFA $M'_1$
that is exactly the same as $M_1$, except that its set of final states consists of
all states $[b, c, q, r]$ for which there exists $j \ge 0$ such that
$\delta'([b, c, q, r], (0, 0)^j)$ contains a state of $F'$.

We now convert $M'_1$ to a DFA $M_2 = (Q'', \Sigma_k \times \Sigma_k, \delta'', q''_0, F'')$ using the usual
subset construction. We define $M_3 = (Q'', \Sigma_k \times \Sigma_k, \delta'', q''_0, Q'' - F'')$. It is not
hard to see that $M_3$ accepts some input $(P, N)$ with $P \ge 1$ if and only if $\mathbf{a}$ is
ultimately periodic. This can be checked by creating a DFA $M_4$ that accepts exactly the
inputs in which some digit of $P$ is nonzero, that is, the language
$(\Sigma_k \times \Sigma_k)^* ((\Sigma_k - \{0\}) \times \Sigma_k) (\Sigma_k \times \Sigma_k)^*$, and, using the usual direct product
construction, creating a DFA $M_5$ that accepts $L(M_3) \cap L(M_4)$. Then $\mathbf{a}$ is ultimately periodic
if and only if $M_5$ accepts some string, and this can be checked using the usual
depth-first search to look for a path connecting the initial state with some
final state. ■
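To see the first automaton of this proof at work, the sketch below simulates it on a concrete pair $(P, N)$: it tracks the set of reachable states $[b, c, q, r]$ digit by digit and then closes under padding with $(0, 0)$. This is an illustration under assumptions, not the full decision procedure: it does not build the determinized, complemented, or product automata, and the function names and the two example DFAOs (Thue-Morse, and the parity sequence $a_n = n \bmod 2$) are chosen for this example.

```python
def m1_step(states, p_digit, n_digit, k, delta):
    """One transition of the NFA on input (p', n'): guess every digit i' of I."""
    nxt = set()
    for (b, c, q, r) in states:
        for i in range(k):
            s = i + p_digit + c
            flag = "<" if i < n_digit else ">" if i > n_digit else b
            nxt.add((flag, s // k, delta[(q, i)], delta[(r, s % k)]))
    return nxt

def is_final(state, tau):
    b, c, q, r = state
    return b in (">", "=") and c == 0 and tau[q] != tau[r]

def m1_accepts(P, N, k, delta, q0, tau):
    """True iff some I >= N has a_I != a_{I+P}."""
    states = {("=", 0, q0, q0)}
    while P > 0 or N > 0:
        P, p = divmod(P, k)
        N, n = divmod(N, k)
        states = m1_step(states, p, n, k, delta)
    # Padding with (0, 0)^j: explore until no new state appears.
    seen = set(states)
    frontier = states
    while frontier:
        if any(is_final(s, tau) for s in frontier):
            return True
        frontier = m1_step(frontier, 0, 0, k, delta) - seen
        seen |= frontier
    return False

# Example DFAOs (least significant digit first).
TM_DELTA = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
TM_TAU = {0: 0, 1: 1}
PARITY_DELTA = {("s", 0): "e", ("s", 1): "o",
                ("e", 0): "e", ("e", 1): "e",
                ("o", 0): "o", ("o", 1): "o"}
PARITY_TAU = {"s": 0, "e": 0, "o": 1}
```

For the parity sequence the call with $P = 2$, $N = 0$ is rejected, which witnesses ultimate periodicity; for the Thue-Morse automaton every small pair with $P \ge 1$ is accepted, as expected for a sequence that is not ultimately periodic.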
3 Decision problems about repetitions

A morphism $h : \Sigma^* \to \Delta^*$ is said to be $k$-power-free if whenever $w$ is $k$-power-free,
so is $h(w)$. There is a reasonably large literature about these morphisms,
with most investigators concentrating on giving computable characterizations
of such morphisms; see, for example, [11,15,21,27,35].

We say a morphism $h : \Sigma^* \to \Sigma^*$ is prolongable on a letter $a$ if $h(a) = ax$ for
some $x$ such that $h^i(x) \ne \epsilon$ for all $i \ge 0$. In this case there is a unique infinite
word with prefixes $h^i(a)$ for all $i \ge 0$, which we write as $h^\omega(a)$. Such a word
is called morphic. It is also of interest to give computable characterizations
of those $h$ for which $h^\omega(a)$ avoids various kinds of repetitions. (Note that it
is possible for $h^\omega(a)$ to, for example, avoid squares, even if $h$ itself is not
squarefree. The morphism $g$ given above in Section 1 provides an example.
Here 212 is squarefree, but $g(212)$ is not.)

Berstel [11] showed how to decide if $h^\omega(a)$ is squarefree for three-letter alphabets.
Karhumäki [21] showed how to decide if $h^\omega(a)$ is overlap-free for two-letter
alphabets. Later, Mignosi and Séébold [31] gave a general algorithm for
testing the $k$-power-freeness of $h^\omega(a)$ for arbitrary non-erasing morphisms $h$
and integers $k \ge 2$. Cassaigne [13] showed how to test if certain kinds of HD0L
words avoid arbitrary patterns.

The technique of Section 2 can be modified to create a decision procedure
for the existence of many kinds of repetitions in $k$-automatic sequences. Our
approach is both more and less general than previous results in the literature.
It is less general because our technique works only for uniform morphisms.
It is more general because (a) it works not only for fixed points of uniform
morphisms, but also images of those fixed points (under a coding); (b) it works
for testing the $r$-power-freeness and $r^+$-power-freeness of words, where $r$ is an
arbitrary rational number $> 1$, a topic relatively unexplored in the literature
until now (but see [25,26]); and (c) it works for arbitrary alphabets. We do
not know how to make our technique work for $r$ an irrational number.

The following theorem illustrates the technique. 



Theorem 2 The following question is decidable: given a $k$-automatic sequence
$\mathbf{a} = (a_n)_{n \ge 0}$ represented by a DFAO, is $\mathbf{a}$ overlap-free?

Proof. The proof is very similar to the proof of Theorem 1. The sequence
$\mathbf{a} = (a_n)_{n \ge 0}$ contains an overlap if and only if there exist integers $I \ge 0$, $T \ge 1$
such that $a_{I+J} = a_{I+T+J}$ for all $J$, $0 \le J \le T$.
Given a DFAO $M = (Q, \Sigma_k, \Delta, \delta, q_0, \tau)$ for $\mathbf{a}$, we create an NFA $M_2$ that on
input $(I, T)$ accepts if there exists an integer $J$, $0 \le J \le T$, such that
$a_{I+J} \ne a_{I+T+J}$. To accomplish this, $M_2$ guesses the digits of $J$, verifies that $0 \le J \le T$,
computes $I + J$ and $I + T + J$ on the fly, and accepts if $a_{I+J} \ne a_{I+T+J}$. As
before, we handle the problem that the expansion of $I + T + J$ might be longer
than that of $I$ or $T$ by allowing inputs with leading zeroes (actually trailing,
since inputs are entered starting with the least significant digit). To do so, we
modify the accepting states of $M_2$ to get a new NFA $M_3$, by making a state of
$M_3$ accepting if an accepting state of $M_2$ can be reached from it along a path
labeled $(0, 0)^j$ for some $j \ge 0$.

We now convert $M_3$ to a DFA using the subset construction, and change all
accepting states to non-accepting and vice versa, obtaining a DFA $M_4$. Hence
$M_4$ accepts $(I, T)$ if for all $J$ with $0 \le J \le T$ we have $a_{I+J} = a_{I+T+J}$; i.e., there is
an overlap of length $2T + 1$ beginning at position $I$ of $\mathbf{a}$. Thus $\mathbf{a}$ contains an
overlap if and only if $M_4$ accepts $(I, T)$ for some integers $I \ge 0$ and $T \ge 1$,
which, as before, can be easily checked.

Here are the full details for the construction of $M_2 = (Q', \Sigma_k \times \Sigma_k, \delta', q'_0, F')$.
The states are 5-tuples of the form $[b, c, d, q, r]$ where $b$ is one of $<$, $=$, or $>$,
expressing the relationship between the guessed $J$ and the input $T$; $c$ is the
carry in the computation of $I + J$; $d$ is the carry in the computation of $I + T + J$;
$q$ is the state of $M$ reached on input $I + J$; and $r$ is the state of $M$ reached
on input $I + T + J$. The initial state is $q'_0 = [=, 0, 0, q_0, q_0]$, and the set of final
states is

$F' = \{[b, 0, 0, q, r] : b \in \{<, =\} \text{ and } \tau(q) \ne \tau(r)\}$.

Finally, $\delta'$ is defined as follows:

\[
\delta'([b, c, d, q, r], (i', t')) = \left\{ \left[ u(b, j', t'),\ \left\lfloor \frac{c + i' + j'}{k} \right\rfloor,\ \left\lfloor \frac{d + i' + j' + t'}{k} \right\rfloor,\ \delta(q, (c + i' + j') \bmod k),\ \delta(r, (d + i' + j' + t') \bmod k) \right] : 0 \le j' < k \right\}.
\]



Example 3 Using the Grail package [34], version 3.3.4, we verified purely
mechanically that the Thue-Morse word $\mathbf{t}$ is overlap-free. We carried out the
construction of Theorem 2 by creating an NFA of 72 states (3 possibilities for
$b$, 2 for $c$, 3 for $d$ (since carries for $d + i' + j' + t'$ could be as much as 2), and 2
possibilities for each of $q$ and $r$). We added the correct final states, and then
converted this to a DFA with 801 states. We then took the complement of this
DFA, obtaining a DFA that accepts all pairs $(I, T)$ where there is an overlap
of length $2T + 1$ beginning at position $I$. We then minimized, obtaining a
DFA with 2 states that only accepts strings corresponding to $T = 0$. Hence $\mathbf{t}$
is overlap-free.
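The overlap condition used in the proof ($a_{I+J} = a_{I+T+J}$ for all $0 \le J \le T$) can also be tested by brute force on a finite prefix. The sketch below is only a sanity check with assumed function names; unlike the automaton-based verification, it says nothing about positions beyond the prefix examined.

```python
def thue_morse(length):
    return [bin(n).count("1") % 2 for n in range(length)]

def overlap_at(a, i, t):
    """True if a[i+j] == a[i+t+j] for all 0 <= j <= t,
    i.e. an overlap of length 2t+1 begins at position i."""
    return all(a[i + j] == a[i + t + j] for j in range(t + 1))

def find_overlap(a):
    """Return the first (i, t) with t >= 1 giving an overlap inside a, or None."""
    n = len(a)
    for t in range(1, (n - 1) // 2 + 1):
        for i in range(n - 2 * t):
            if overlap_at(a, i, t):
                return (i, t)
    return None
```

On the English example alfalfa the search returns position 0 with $T = 3$, matching the length $2T + 1 = 7$.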
The same idea can be used to prove each of the following results:

Theorem 4 Given a DFAO $M$ generating a $k$-automatic sequence $\mathbf{a}$, each of
the following properties is decidable:

(a) Given a rational number $r$, whether $\mathbf{a}$ avoids $r$-powers (resp., $r^+$-powers);

(b) Given a rational number $r$, whether $\mathbf{a}$ contains infinitely many occurrences
of $r$-powers (resp., $r^+$-powers);

(c) Given a rational number $r$, whether $\mathbf{a}$ contains infinitely many distinct
$r$-powers (resp., $r^+$-powers);

(d) Given a rational number $r$, and a length $l$, whether $\mathbf{a}$ avoids $x^r$ (resp.,
$r^+$-powers) for $|x| \ge l$;

(e) Given a rational number $r$, whether $\mathbf{a}$ avoids $x^r$ for all sufficiently long $x$;

(f) Given a length $l$, whether $\mathbf{a}$ avoids palindromes of length $\ge l$ (cf. [33]);

(g) Whether $\mathbf{a}$ avoids all sufficiently long palindromes;

(h) Given a length $l$, whether $\mathbf{a}$ satisfies the property that if $x$ is a factor of $\mathbf{a}$ of
length $\ge l$, then its reverse $x^R$ is not (cf. [30]);

(i) Assuming $\mathbf{a}$ is defined over the alphabet $\{0, 1, \ldots, j-1\}$, whether $\mathbf{a}$ avoids
all factors of the form $x\sigma(x)$ where $\sigma(a) = (a + 1) \bmod j$ (cf. [29]).

The proofs for each part are more-or-less trivial variations on the proof of
Theorem 2 and we omit them. However, we do make one remark: for parts
(a)-(e), we need to replace the condition for the existence of overlaps, namely,
"there exist $I \ge 0$, $T \ge 1$ such that $a_{I+J} = a_{I+T+J}$ for all $J$, $0 \le J \le T$" with
the appropriate condition for $\alpha$-powers, where $\alpha = p/q$ is a rational number.
The new condition is "there exist $I \ge 0$, $T \ge 1$ such that $a_{I+J} = a_{I+T+J}$ for
all $J$, $0 \le J < (p/q - 1)T$". (In the case of $\alpha^+$-powers, the inequality becomes
$0 \le J \le (p/q - 1)T$.) At first sight it might seem difficult to implement this
test, for although multiplication can be carried out easily starting with the
least significant digit, division is more problematic. To handle this, we simply
rewrite the inequality $J < (p/q - 1)T$ as $qJ < (p - q)T$. Now on input $T$ we can
guess $J$ digit-by-digit, transduce $J$ into $qJ$ and $T$ into $(p - q)T$, and verify the
inequality $qJ < (p - q)T$ on the fly starting with the least significant digit, as
before.
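The division-free form of the condition is easy to exercise on finite words. The sketch below tests for a $(p/q)$-power or $(p/q)^+$-power using only the integer inequality $qJ < (p-q)T$ (or $qJ \le (p-q)T$ for the plus variant). It is a brute-force illustration with assumed function names, not the automaton construction.

```python
def j_in_range(j, t, p, q, plus):
    """Integer form of J < (p/q - 1)T, or J <= (p/q - 1)T when plus is set."""
    return q * j <= (p - q) * t if plus else q * j < (p - q) * t

def power_at(a, i, t, p, q, plus=False):
    """True if a[i+j] == a[i+t+j] for every admissible j, all inside a."""
    j = 0
    while j_in_range(j, t, p, q, plus):
        if i + t + j >= len(a) or a[i + j] != a[i + t + j]:
            return False
        j += 1
    return True

def has_power(a, p, q, plus=False):
    """True if the finite word a contains a (p/q)-power (or (p/q)^+-power)."""
    return any(power_at(a, i, t, p, q, plus)
               for t in range(1, len(a))
               for i in range(len(a) - t))

def thue_morse(length):
    return [bin(n).count("1") % 2 for n in range(length)]
```

With $p/q = 2$ the plus variant is exactly the overlap test, so a prefix of the Thue-Morse word contains 2-powers (squares) but no $2^+$-powers.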



4 The orbit closure 

We now turn to orbits and the orbit closure of automatic sequences. As motivation,
recall that a certain classical dynamical system (i.e., a compact set
together with a continuous map of this set) is associated with any sequence,
namely the topological closure of the orbit of that sequence under the shift.
For some sequences, the lexicographically least and largest sequences in the
orbit closure are known explicitly.

Consider, as an example, the Thue-Morse sequence $\mathbf{t}$. The lexicographically
least sequence in the orbit closure of $\mathbf{t}$ is the sequence obtained by iterating
the Thue-Morse morphism $\mu : 0 \to 01, 1 \to 10$ on 1, and then dropping the
first letter [2,3,5,22]. This gives

$001011001101001 \cdots$

and this sequence is clearly 2-automatic, as it is generated by the DFAO in
Figure 3 below.

Fig. 3. Automaton generating the lexicographically least sequence in the orbit closure
of the Thue-Morse sequence
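A prefix of the lexicographically least sequence in the orbit closure is the lexicographically least factor of that length, so the claim about the Thue-Morse sequence can be checked on finite data. The sketch below is an illustration with assumed function names; it relies on the prefix of $\mathbf{t}$ being long enough to contain the minimal factor of the requested length.

```python
def thue_morse(length):
    return "".join(str(bin(n).count("1") % 2) for n in range(length))

def least_factor(a, m):
    """Lexicographically least factor of length m of the finite word a."""
    return min(a[i:i + m] for i in range(len(a) - m + 1))

def mu_iterate_on_1(length):
    """Prefix of the word obtained by iterating 0 -> 01, 1 -> 10 on the letter 1."""
    word = "1"
    while len(word) < length:
        word = "".join("01" if ch == "0" else "10" for ch in word)
    return word[:length]
```

For length 15 the least factor of a 2048-letter prefix is 001011001101001, in agreement with the displayed sequence.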

Other examples are discussed in Section 6. Recall that the Rudin-Shapiro
sequence $\mathbf{u} = (u_n)_{n \ge 0}$ is a 2-automatic sequence defined as follows: $u_n$ is 0 or
1 according to whether the number of (possibly overlapping) occurrences of
11 in the binary expansion of $n$ is even or odd. We observe empirically that
the lexicographically least sequence in the orbit closure of the Rudin-Shapiro
sequence seems to be the sequence obtained by preceding the Rudin-Shapiro
sequence by a 0, but we have not yet proved this.

We now apply the technique of Section 2 to the lexicographically least
sequence in the orbit closure of a $k$-automatic sequence. Our idea is based on
the following characterization.

Lemma 5 Let $\mathbf{a} = (a_n)_{n \ge 0}$ be a sequence, and let $\mathbf{b} = (b_n)_{n \ge 0}$ be the
lexicographically least sequence in the orbit closure of $\mathbf{a}$. Then $b_i = c$ if and only if
there exists $j \ge 0$ such that $a_{j+i} = c$ and $a_l a_{l+1} \cdots a_{l+i} \ge a_j a_{j+1} \cdots a_{j+i}$ for all
$l \ge 0$.

Proof. Suppose $b_i = c$. Then there exists $j \ge 0$ such that
$a_j a_{j+1} \cdots a_{j+i} = b_0 b_1 \cdots b_i$, so $a_{j+i} = b_i$. But then
$a_l a_{l+1} \cdots a_{l+i} \ge a_j a_{j+1} \cdots a_{j+i}$ for all $l \ge 0$.
(Here we use $\ge$ for lexicographic order.)

On the other hand, if $a_l a_{l+1} \cdots a_{l+i} \ge a_j a_{j+1} \cdots a_{j+i}$ for all $l \ge 0$, then
$a_j a_{j+1} \cdots a_{j+i}$ must be the prefix of $\mathbf{b}$ of length $i + 1$ and so $b_i = a_{j+i} = c$. ■
The advantage to this characterization of $b_i$ is that it does not require explicit
knowledge of $b_0, b_1, \ldots, b_{i-1}$.

Theorem 6 Let $\mathbf{a}$ be $k$-automatic, and let $\mathbf{b}$ be the lexicographically least
sequence in the orbit closure of $\mathbf{a}$. Then $\mathbf{b}$ is $k$-automatic.

Proof. The idea is to use the condition in Lemma 5. The proof is similar to
the proof of Theorem 1 and we outline it below. The fine details about how
everything is computed are similar to those of Theorem 1 and we omit them.

The proof consists of several steps. First, suppose we have a $k$-DFAO $M$
generating $\mathbf{a}$. We now create an NFA $M_1$ that on input $(L, J, R)$ accepts if
and only if there exists $t$, $0 \le t < R$, such that $a_{L+t} \ne a_{J+t}$ or $a_{L+R} \ge a_{J+R}$.
The idea is to "guess" $t$ digit-by-digit, verify the inequality $0 \le t < R$, while
simultaneously computing the quantities $L + t$, $J + t$, $L + R$, and $J + R$. We
accept if $a_{L+t} \ne a_{J+t}$ for some $t$, $0 \le t < R$, or if $a_{L+R} \ge a_{J+R}$.

From $M_1$ we create a DFA $M_2$ that on input $(L, J, R)$ accepts if and only if
$a_{L+t} = a_{J+t}$ for all $t$, $0 \le t < R$, and $a_{L+R} < a_{J+R}$. This is done by converting
$M_1$ to a DFA using the subset construction and changing all accepting states
to non-accepting and vice versa. Thus $M_2$ accepts $(L, J, R)$ if and only if
$a_L a_{L+1} \cdots a_{L+R} < a_J a_{J+1} \cdots a_{J+R}$.

Next, from $M_2$ we create an NFA $M_3$ that on input $(J, R)$ accepts if and only
if there exists an $L \ge 0$ such that $a_L a_{L+1} \cdots a_{L+R} < a_J a_{J+1} \cdots a_{J+R}$. The idea
is to "guess" $L$ digit-by-digit and call $M_2$ on $(L, J, R)$. A priori $L$ could be very
big compared to $J$ and $R$, but our previous trick to handle this works.

Then from $M_3$ we create a DFA $M_4$ that on input $(J, R)$ accepts if and only
if for all $L \ge 0$ we have $a_L a_{L+1} \cdots a_{L+R} \ge a_J a_{J+1} \cdots a_{J+R}$. This is done by
converting $M_3$ to a DFA using the subset construction, and then changing all
accepting states to non-accepting and vice versa.

From $M_4$ we create an NFA $M_5$ that on input $cI$ (i.e., the character $c$
concatenated with the base-$k$ expansion of $I$) accepts if and only if there exists
$J \ge 0$ with $a_{J+I} = c$ and $a_L a_{L+1} \cdots a_{L+I} \ge a_J a_{J+1} \cdots a_{J+I}$ for all $L \ge 0$. This
is done by recording $c$ in the state, "guessing" $J$ digit-by-digit, computing $J + I$
digit-by-digit and simulating $M$ on $J + I$, and calling $M_4$ with input $(J, I)$. We
then convert $M_5$ to a DFA $M_6$ using the subset construction.

Finally, we create a $k$-DFAO $M_7$ that on input $I$ simulates $M_6$ on input $cI$
in parallel for each $c \in \Delta$. Exactly one branch will accept, and the output
associated with this branch is $c$. ■
5 Continued fraction expansions

The results of the previous section can be generalized to other kinds of orders.
Instead of the ordinary lexicographic order, we could consider an order
that depends on the index of the string being compared. One way to
do this is to consider a sequence of permutations $(\psi_i)_{i \ge 0}$, where each
$\psi_i : \Delta \to \Delta$, and when comparing $a_0 a_1 \cdots$ to $b_0 b_1 \cdots$ we instead compare
$\psi_0(a_0) \cdots \psi_{i-1}(a_{i-1})$ to $\psi_0(b_0) \cdots \psi_{i-1}(b_{i-1})$ (using the ordinary lexicographic
order). An example of this kind of ordering comes from continued fractions,
where $[a_0, a_1, a_2, \ldots] < [b_0, b_1, b_2, \ldots]$ if and only if $a_0 < b_0$, or $a_0 = b_0$ and
$a_1 > b_1$, or $a_0 = b_0$, $a_1 = b_1$, and $a_2 < b_2$, etc. This corresponds to inverting
the order of the elements being compared on the odd indexes. Provided the
sequence $(\psi_i)_{i \ge 0}$ is $k$-automatic, the result of Theorem 6 still holds.
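The alternating order on continued fractions can be compared with the order of the values they represent. The sketch below is an illustration with assumed function names; to avoid the usual ambiguity of finite expansions it only compares expansions of equal length with positive partial quotients, and uses exact rational arithmetic.

```python
from fractions import Fraction

def cf_value(quotients):
    """Exact value of the finite continued fraction [a0, a1, ..., an]."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value

def cf_less(a, b):
    """Alternating lexicographic order: the usual order at even indices,
    the reversed order at odd indices."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return x < y if index % 2 == 0 else x > y
    return False
```

In the terms of the text, the permutation at index $i$ is the identity for even $i$ and the order-reversing map for odd $i$.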

Corollary 7 Let $(\psi_i)_{i \ge 0}$ be a $k$-automatic sequence of permutations, and let
$(a_i)_{i \ge 0}$ be a $k$-automatic sequence. Then the lexicographically least sequence in
the orbit closure, as modified by the permutations $(\psi_i)$, is $k$-automatic.

Proof. In the construction of Theorem 6, when we compare $a_{L+R}$ to $a_{J+R}$,
we instead compare $\psi_R(a_{L+R})$ to $\psi_R(a_{J+R})$. Since $(\psi_i)_{i \ge 0}$ is $k$-automatic, there
is no problem computing $\psi_R$ on input $R$. ■

From now on, when we talk about a continued fraction expansion $[a_0, a_1, \ldots]$
being $k$-automatic, we mean the continued fraction has bounded partial quotients
and the underlying sequence of partial quotients $(a_i)_{i \ge 0}$ is $k$-automatic.

Let $T(x)$ be the usual transformation on continued fractions defined by
$T(x) = \frac{1}{x - \lfloor x \rfloor}$, so that $T([a_0, a_1, a_2, \ldots]) = [a_1, a_2, \ldots]$. Thus we have

Theorem 8 Let $x$ be an irrational real number with a $k$-automatic continued
fraction expansion $[a_0, a_1, \ldots]$. Then the continued fraction expansions of both
$\liminf_{n \to \infty} T^n(x)$ and $\limsup_{n \to \infty} T^n(x)$ are $k$-automatic.

Proof. Use Corollary 7, where the permutations invert the order of the
letters on every other index. ■

In addition to the orbit closure of a sequence, we can study a related structure,
which we call the reverse orbit closure. We say that a sequence $\mathbf{b} = (b_n)_{n \ge 0}$ is
in the reverse orbit closure of $\mathbf{a} = (a_n)_{n \ge 0}$ if every finite prefix of $\mathbf{b}$ is a prefix
of some word of the form $a_r a_{r-1} a_{r-2} \cdots a_1 a_0$.

Theorem 9 If $\mathbf{a} = (a_n)_{n \ge 0}$ is $k$-automatic, then so is the lexicographically
least sequence in the reverse orbit closure.

Proof. Let $\mathbf{b} = (b_n)_{n \ge 0}$ be the lexicographically least sequence in the reverse
orbit closure of $\mathbf{a} = (a_n)_{n \ge 0}$. We use the following characterization of $\mathbf{b}$: $b_i = c$
if and only if there exists $r \ge i$ such that $a_{r-i} = c$ and
$a_s a_{s-1} \cdots a_{s-i} \ge a_r a_{r-1} \cdots a_{r-i}$ for all $s \ge i$.

We can now implement this test in exactly the same way that we implemented
the test in the proof of Theorem 6. ■



We can also combine the reverse orbit closure with a permutation that inverts
the order of the letters on every other index.

Theorem 10 Let $\alpha$ be an irrational real number with a $k$-automatic continued
fraction expansion $[a_0, a_1, a_2, \ldots]$. Let $p_n/q_n$ be the $n$'th convergent to the
continued fraction of $\alpha$. Let $\beta = \liminf_{n \to \infty} p_n/p_{n-1}$ and $\gamma = \liminf_{n \to \infty} q_n/q_{n-1}$,
$\delta = \limsup_{n \to \infty} p_n/p_{n-1}$, $\zeta = \limsup_{n \to \infty} q_n/q_{n-1}$. Then the continued fraction
expansion of each of $\beta, \gamma, \delta, \zeta$ is $k$-automatic.

Proof. We prove the result for $\beta$, the others being similar. By a famous
result of Galois [18] we have

\[
\frac{p_n}{p_{n-1}} = [a_n, a_{n-1}, \ldots, a_0].
\]

Now $\beta$ corresponds to the lexicographically least sequence in the reverse orbit
closure of $(a_i)_{i \ge 0}$, except that the ordering is slightly different from the usual
ordering, where the ordering is as usual on the even indexed terms and opposite
on the odd-indexed terms. As in Corollary 7, we can handle this in the same
way. ■
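The reversal identity used in this proof follows from the recurrences $p_n = a_n p_{n-1} + p_{n-2}$ and $q_n = a_n q_{n-1} + q_{n-2}$, and it can be checked with exact arithmetic. The sketch below is an illustration with assumed function names; it takes $a_0 \ge 1$ so that the reversed expansion ending in $a_0$ is well defined, and it also checks the companion identity $q_n/q_{n-1} = [a_n, \ldots, a_1]$.

```python
from fractions import Fraction

def cf_value(quotients):
    """Exact value of the finite continued fraction [a0, a1, ..., an]."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value

def convergents(quotients):
    """List of (p_n, q_n) for the expansion [a0, a1, ...]."""
    p_prev, p = 1, quotients[0]
    q_prev, q = 0, 1
    out = [(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out
```

Ratios of consecutive denominators thus read the partial quotients backwards, which is why the reverse orbit closure is the relevant object.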



Example. Let us consider an example. As is well-known [36,39], for integers
$k \ge 3$ the real number

\[
\alpha_k = \sum_{i \ge 0} k^{-2^i} = [0, k-1, k+2, k, k, k-2, k, k+2, k, k-2, k+2, k, k-2, \ldots]
\]

has a 2-automatic continued fraction expansion, generated by the automaton
given in Figure 4 (again, the automaton expects the least significant digit
first).

Fig. 4. Automaton generating the continued fraction for $\alpha_k$

Fig. 5. Automaton generating the continued fraction for $\zeta_k$

Then $\zeta_k = \limsup_{n \to \infty} q_n/q_{n-1} = [k+2, k-2, k, k+2, k, k-2, k, k, \ldots]$ is
2-automatic. ■
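The first partial quotients of $\alpha_k$ can be recomputed exactly from a partial sum of the series, since truncating after the term $k^{-32}$ changes the value by less than $k^{-63}$ and leaves the early partial quotients untouched. The sketch below is an illustration with assumed function names; it checks only the first 13 partial quotients, for a few small values of $k$.

```python
from fractions import Fraction

def continued_fraction(x, terms):
    """First partial quotients of the nonnegative rational x."""
    out = []
    for _ in range(terms):
        a = x.numerator // x.denominator
        out.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return out

def alpha_partial_sum(k, last_exponent):
    """Sum of k^(-2^i) for 0 <= i <= last_exponent, as an exact fraction."""
    return sum(Fraction(1, k ** (2 ** i)) for i in range(last_exponent + 1))

def expected_prefix(k):
    return [0, k - 1, k + 2, k, k, k - 2, k, k + 2, k, k - 2, k + 2, k, k - 2]
```

For $k = 3$ this gives 0, 2, 5, 3, 3, 1, 3, 5, 3, 1, 5, 3, 1, and the shorter partial sum $37/81$ has the finite expansion [0, 2, 5, 3, 2].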
Let $\alpha$ be an irrational number with convergents $p_n/q_n$. The quantity
$\zeta = \limsup_{n \ge 0} q_n/q_{n-1}$ figures in a number of recent papers in combinatorics
on words. For example, $2 + \zeta$ is the value of the recurrence quotient of a
Sturmian word with slope $\alpha$ [14]. Hence this recurrence quotient has a
$k$-automatic continued fraction if $\alpha$ does.

The number $\zeta$ also appears (actually, $\zeta + 1$) as the irrationality measure of
numbers of the form $(b - 1) \sum_{n \ge 1} b^{-\lfloor n\alpha \rfloor}$ [1].

Finally, $\zeta$ also appears in a formula giving the critical exponent (aka "index")
of Sturmian words, as found by Damanik and Lenz [16, Thm. 1, p. 24] and
Cao and Wen [12, Thm. 9, p. 380]. This exponent is essentially

\[
C := 2 + \limsup_{n \ge 1} \frac{q_n}{q_{n-1}}.
\]

If the limsup is actually attained for a particular $n$, then the critical exponent
is rational. Otherwise it clearly coincides with $2 + \zeta$, and its continued fraction
expansion is $k$-automatic if that of $\alpha$ is.



6 Applications 

Our results about the lexicographically least and largest sequences in the orbit
closure of a sequence can be illustrated by and applied to two families of
binary sequences: the sequences in the set $\Gamma$ described below and the Sturmian
sequences.

6.1 Sequences in the set $\Gamma$

Theorem 6 can be applied to shed some light on the automatic sequences that
belong to two sets of binary sequences: the set $\Gamma$ occurring in the study of
iterations of continuous unimodal maps of the interval (see [3,2]) and the set
$\Gamma_{\mathrm{strict}}$ occurring in the study of unique $\beta$-expansions of the number 1 [17,22,4],
where

$\Gamma := \{A \in \{0, 1\}^\omega : \forall k \ge 0,\ \overline{A} \le \sigma^k A \le A\}$

$\Gamma_{\mathrm{strict}} := \{A \in \{0, 1\}^\omega : \forall k \ge 1,\ \overline{A} < \sigma^k A < A\}$.

Here $A = (a_n)_{n \ge 0}$, and $\sigma$ is the shift on sequences defined by $\sigma A := (a_{n+1})_{n \ge 0}$.
The bar operation replaces 0's by 1's and 1's by 0's, i.e., $\overline{A} := (1 - a_n)_{n \ge 0}$.
Note that these two sets differ only by a set of (purely) periodic sequences.
Also note that the set $\Gamma$ above differs slightly from the set $\Gamma$ in [2], in that the
set $\Gamma$ above contains the extra sequence $(10)^\omega$.

The shifted Thue-Morse sequence is an element of $\Gamma$, as are more general
automatic sequences (e.g., analogues of the Thue-Morse sequence including
the $q$-mirror sequences introduced in [3,2]; see [23,24,40,32,6]).

Now for any binary sequence $A$ belonging to $\Gamma$, define, as in [3,2],

$\Gamma_A := \{B \in \{0, 1\}^\omega : \forall k \ge 0,\ \overline{A} \le \sigma^k B \le A\}$.

Of course, the sequence $A$ belongs to $\Gamma_A$. Furthermore $1^\omega$ belongs to $\Gamma$, and
any binary sequence $B$ belongs to $\Gamma_{1^\omega}$. Thus, given $B$, it is interesting to look
for the lexicographically least sequence $A$ such that $B$ belongs to $\Gamma_A$. The
answer is easy (see [2, pp. 37-38]): the least sequence $A$ in $\Gamma$ such that $B$
belongs to $\Gamma_A$ is

$\Theta(B) := \sup(\{\sigma^k B : k \ge 0\} \cup \{\sigma^\ell \overline{B} : \ell \ge 0\})$.

In particular for any sequence $B$, all sequences $\sigma^k B$ and $\sigma^\ell \overline{B}$ belong to $\Gamma_{\Theta(B)}$,
and $\Theta(B)$ is the largest such sequence.
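A prefix of $\Theta(B)$ can be approximated from a finite prefix of $B$ by taking the largest factor of the required length among the factors of $B$ and of its complement. The sketch below is a finite illustration with assumed function names; its answer is exact only when the prefix of $B$ is long enough to contain the maximal factor, which is the case in the Thue-Morse example used here.

```python
def thue_morse(length):
    return "".join(str(bin(n).count("1") % 2) for n in range(length))

def complement(b):
    """Bar operation: exchange 0s and 1s."""
    return "".join("1" if ch == "0" else "0" for ch in b)

def theta_prefix(b, m):
    """Largest length-m factor among the factors of b and of its complement."""
    words = (b, complement(b))
    return max(w[i:i + m] for w in words for i in range(len(w) - m + 1))
```

For $B = \mathbf{t}$ the result is the prefix of the shifted Thue-Morse sequence, consistent with the remark that the shifted Thue-Morse sequence belongs to $\Gamma$.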

Theorem 6 above shows that if $B$ is automatic, then so is $\Theta(B)$. This remark
is a small step in the study of all automatic sequences belonging to $\Gamma$. Note
that $\Gamma$ is not countable (see e.g., [2, Prop. 3, p. 35]), so that $\Gamma$ also contains
sequences that are not automatic. Even more, $\Gamma$ contains sequences whose
subword complexity is not $O(n)$: it suffices to take the sequence $\Theta(B)$, where
$B$ is, as in [19], a binary minimal sequence with positive topological entropy,
hence with subword complexity not of the form $O(n)$.

6.2 Sturmian sequences 

We suppose that the reader is familiar with the notion of Sturmian sequence
(see, e.g., [30, Chapter 2]). A result on characteristic Sturmian sequences and
Sturmian sequences that was proved or partly proved several times (see the
survey [7]) states that

Theorem 11 

(a) A nonperiodic sequence $A$ is characteristic Sturmian if and only if for any
$k \ge 0$ the following inequalities hold

$$0A < \sigma^k A < 1A.$$

(b) A nonperiodic binary sequence $A$ is Sturmian if and only if there exists a
binary sequence $B$ such that for any $k \ge 0$ the following inequalities hold

$$0B \le \sigma^k A \le 1B.$$

Furthermore such a $B$ is unique, and is the characteristic Sturmian sequence
having the same slope as $A$.

Theorem 11 easily implies the following corollary.

Corollary 12 The lexicographically least (resp. largest) sequence in the orbit
closure of a Sturmian sequence $A$ is the sequence $0B$ (resp. $1B$) where $B$ is
the characteristic sequence with the same slope as $A$.

Proof. It is not difficult to see that the inequalities above are optimal in
the sense that, e.g., for a characteristic sequence $A$, we have $0A = \inf\{\sigma^k A :
k \ge 0\}$, and similarly for the other three inequalities in Theorem 11 above. ∎
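As an illustration of Theorem 11(a) and Corollary 12, the sketch below tests the bounds on the Fibonacci word, the fixed point of the morphism $0 \to 01$, $1 \to 0$, which is a standard example of a characteristic Sturmian sequence. This example is an addition for illustration and is not taken from the text above. Only finite blocks are compared, so the strict inequalities of the theorem appear as non-strict inequalities on blocks; the helper names are illustrative.

```python
def fibonacci_word(n):
    """First n letters of the fixed point of the morphism 0 -> 01, 1 -> 0."""
    w = [0]
    while len(w) < n:
        w = [x for a in w for x in ((0, 1) if a == 0 else (0,))]
    return w[:n]


def blocks(prefix, window):
    """All length-`window` blocks of the shifts that fit in the prefix."""
    return [prefix[k:k + window] for k in range(len(prefix) - window + 1)]


def within_bounds(prefix, window):
    """Check 0A <= sigma^k A <= 1A on blocks of length `window`."""
    low = ([0] + prefix)[:window]
    high = ([1] + prefix)[:window]
    return all(low <= b <= high for b in blocks(prefix, window))


f = fibonacci_word(2000)
print(within_bounds(f, 20))                      # True

# Corollary 12 on blocks: the extreme blocks are the prefixes of 0f and 1f.
print(min(blocks(f, 20)) == ([0] + f)[:20])      # True
print(max(blocks(f, 20)) == ([1] + f)[:20])      # True

# A shifted Fibonacci word is Sturmian but not characteristic, so the bounds
# taken with A itself in place of the characteristic sequence fail.
print(within_bounds(fibonacci_word(2001)[1:], 20))  # False
```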



7 Acknowledgments 

We thank Kalle Saari for his help with Finnish and for pointing out an error 
in a previous version. We thank the referees for a careful reading of the paper. 



References 

[1] B. Adamczewski and J.-P. Allouche. Reversals and palindromes in continued
fractions. Theoret. Comput. Sci. 380 (2007), 220-237.

[2] J.-P. Allouche. Théorie des Nombres et Automates. Thèse d'État, 1983. Text
available electronically at
http://tel.archives-ouvertes.fr/tel-00343206/fr/.

[3] J.-P. Allouche and M. Cosnard. Itérations de fonctions unimodales et suites
engendrées par automates. C. R. Acad. Sci. Paris 296 (1983), 159-162.

[4] J.-P. Allouche and M. Cosnard. Non-integer bases, iteration of continuous real
maps, and an arithmetic self-similar set. Acta Math. Hung. 91 (2001), 325-332.

[5] J.-P. Allouche, J. Currie, and J. Shallit. Extremal infinite overlap-free
binary words. Electronic J. Combinatorics 5(1) (1998), #R27 (electronic),
http://www.combinatorics.org/Volume_5/Abstracts/v5i1r27.html






[6] J.-P. Allouche and C. Frougny. Univoque numbers and an avatar of Thue-Morse.
Preprint, http://arxiv.org/abs/0712.0102, 2007. Acta Arith., to appear.

[7] J.-P. Allouche and A. Glen. Extremal properties of (epi)sturmian sequences
and distribution modulo 1. In preparation, 2008.

[8] J.-P. Allouche and J. O. Shallit. The ubiquitous Prouhet-Thue-Morse sequence.
In C. Ding, T. Helleseth, and H. Niederreiter, editors, Sequences and Their
Applications, Proceedings of SETA '98, pp. 1-16. Springer-Verlag, 1999.

[9] J.-P. Allouche and J. Shallit. Automatic Sequences: Theory, Applications,
Generalizations. Cambridge University Press, 2003.

[10] J. Berstel. Sur la construction de mots sans carré. Sém. Théor. Nombres
Bordeaux, 1978-79, Exposé 18, 18.01-18.15.

[11] J. Berstel. Sur les mots sans carré définis par un morphisme. In H. A. Maurer,
ed., Proc. 6th Int'l Conf. on Automata, Languages, and Programming, ICALP
'79, Lect. Notes in Comp. Sci., Vol. 71, Springer-Verlag, 1979, pp. 16-25.

[12] W.-T. Cao and Z.-Y. Wen. Some properties of the factors of Sturmian sequences. 
Theoret. Comput. Sci. 304 (2003), 365-385. 

[13] J. Cassaigne. An algorithm to test if a given circular HD0L-language avoids a
pattern. Information processing '94, Vol. I, (Hamburg, 1994), IFIP Trans. A 
Comput. Sci. Tech. A-51, North-Holland, Amsterdam, 1994, pp. 459-464. DOI 
10.1.1.9.2125. 

[14] J. Cassaigne. Limit values of the recurrence quotient of Sturmian sequences. 
Theoret. Comput. Sci. 218 (1999), 3-12. 

[15] M. Crochemore. Sharp characterizations of squarefree morphisms. Theoret. 
Comput. Sci. 18 (1982), 221-226. 

[16] D. Damanik and D. Lenz. The index of Sturmian sequences. European J. 
Combinatorics 23 (2002), 23-29. 

[17] P. Erdős, I. Joó, and V. Komornik. Characterization of the unique expansions
$1 = \sum_{i=1}^{\infty} q^{-n_i}$, and related problems. Bull. Soc. Math. France 118 (1990),
377-390.

[18] É. Galois. Démonstration d'un théorème sur les fractions continues périodiques.
Ann. Math. Pures et Appl. 19 (1828-9), 294-301.

[19] C. Grillenberger. Construction of strictly ergodic systems. Z. 
Wahrscheinlichkeitstheorie und Verw. Gebiete 25 (1973), 323-334. 

[20] J. Honkala. A decision method for the recognizability of sets defined by number
systems. RAIRO Inform. Théor. App. 20 (1986), 395-403.

[21] J. Karhumäki. On cube-free $\omega$-words generated by binary morphisms. Disc.
Appl. Math. 5 (1983), 279-297.






[22] V. Komornik and P. Loreti. Unique developments in non-integer bases. Amer. 
Math. Monthly 105 (1998), 636-639. 

[23] V. Komornik and P. Loreti. Subexpansions, superexpansions and uniqueness 
properties in non-integer bases. Period. Math. Hungar. 44 (2002), 197-218. 

[24] V. Komornik and P. Loreti. On the topological structure of univoque sets. J. 
Number Theory 122 (2007), 157-183. 

[25] D. Krieger. On critical exponents in fixed points of k-uniform binary
morphisms. RAIRO-Theor. Inf. Appl. 43 (2009), 41-68. Corrigenda available at
http://www.wisdom.weizmann.ac.il/~daliak/papers/UniformBinaryThesis.pdf



[26] D. Krieger. On critical exponents in fixed points of non-erasing morphisms. 
Theoret. Comput. Sci. 376 (2007), 70-88. 

[27] M. Leconte. A characterization of power-free morphisms. Theoret. Comput. 
Sci. 38 (1985), 117-122. 

[28] S. Lehr. Sums and rational multiples of q-automatic sequences are q-automatic.
Theoret. Comput. Sci. 108 (1993), 385-391. 

[29] J. Loftus, J. Shallit, and M.-w. Wang. New problems of pattern avoidance. In 
Developments in Language Theory (DLT '99), pp. 185-199. World Scientific, 
2000. 

[30] M. Lothaire. Algebraic Combinatorics on Words. Cambridge University Press, 
2002. 

[31] F. Mignosi and P. Séébold. If a D0L language is k-power free then it is circular.
In A. Lingas, R. Karlsson, and S. Carlsson, eds., Proc. 20th Int'l Conf. on
Automata, Languages, and Programming, ICALP '93, Lect. Notes in Comp.
Sci., Vol. 700, Springer-Verlag, 1993, pp. 507-518.

[32] M. Niu and Z.-x. Wen. A property of m-tuplings Morse sequence. Wuhan Univ. 
J. Nat. Sci. 11 (2006), 473-476. 

[33] N. Rampersad and J. Shallit. Words avoiding reversed subwords. J. Combin. 
Math. Combin. Comput. 54 (2005), 157-164. 

[34] D. Raymond and D. Wood. Grail: a C++ library for automata and expressions. 
J. Symbolic Comput. 17 (1994), 341-350. 

[35] G. Richomme and F. Wlazinski. Existence of finite test-sets for k-power-freeness
of uniform morphisms. Disc. Appl. Math. 155 (2007), 2001-2016.

[36] J. O. Shallit. Simple continued fractions for some irrational numbers. J. Number 
Theory 11 (1979), 209-217. 

[37] A. Thue. Über unendliche Zeichenreihen. Norske vid. Selsk. Skr. Mat. Nat. Kl.
7 (1906), 1-22. Reprinted in Selected Mathematical Papers of Axel Thue, T. 
Nagell, editor, Universitetsforlaget, Oslo, 1977, pp. 139-158. 






[38] A. Thue. Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen.
Norske vid. Selsk. Skr. Mat. Nat. Kl. 1 (1912), 1-67. Reprinted in Selected 
Mathematical Papers of Axel Thue, T. Nagell, editor, Universitetsforlaget, Oslo, 
1977, pp. 413-478. 

[39] A. J. van der Poorten and J. O. Shallit. Folded continued fractions. J. Number 
Theory 40 (1992), 237-250. 

[40] M. de Vries and V. Komornik. Unique expansions of real numbers. Preprint, 
http://arxiv.org/abs/math/0609708v3, 2007. Adv. in Math., to appear. 






